In the last chapter, we explored half of the Python/C integration picture -- calling C services from Python. This mode lets programmers speed up operations by moving them to C, and utilize external libraries by wrapping them in C extension modules and types. But the inverse can be just as useful -- calling Python from C. By delegating selected components of an application to embedded Python code, we can open them up to onsite changes without having to ship a system's code.
This chapter tells this other half of the Python/C integration tale. It introduces the Python C interfaces that make it possible for programs written in C-compatible languages to run Python program code. In this mode, Python acts as an embedded control language (what some call a "macro" language). Although embedding is mostly presented in isolation here, keep in mind that Python's integration support is best viewed as a whole. A system's structure usually determines an appropriate integration approach: C extensions, embedded code calls, or both. The chapter concludes by discussing a handful of larger integration platforms, such as COM and JPython, that present broader component integration possibilities.
The first thing you should know about Python's embedded-call API is that it is less structured than the extension interfaces. Embedding Python in C may require a bit more creativity on your part than extending; you must pick tools from a general collection of calls to implement the Python integration, rather than coding to a boilerplate structure. The upside of this loose structure is that programs can combine embedding calls and strategies to build up arbitrary integration architectures.
The lack of a more rigid model for embedding is largely the result of a less clear-cut goal. When extending Python, there is a distinct separation for Python and C responsibilities and a clear structure for the integration. C modules and types are required to fit the Python module/type model by conforming to standard extension structures. This makes the integration seamless for Python clients: C extensions look like Python objects and handle most of the work.
But when Python is embedded, the structure isn't as obvious; because C is the enclosing level, there is no clear way to know what model the embedded Python code should fit. C may want to run objects fetched from modules, strings fetched from files or parsed out of documents, and so on. Instead of deciding what C can and cannot do, Python provides a collection of general embedding interface tools, which you use and structure according to your embedding goals.
Most of these tools correspond to tools available to Python programs. Table 20-1 lists some of the more common API calls used for embedding, and their Python equivalents. In general, if you can figure out how to accomplish your embedding goals in pure Python code, you can probably find C API tools that achieve the same results.
C API Call | Python Equivalent
---|---
PyImport_ImportModule | import module, __import__
PyImport_ReloadModule | reload(module)
PyImport_GetModuleDict | sys.modules
PyModule_GetDict | module.__dict__
PyDict_GetItemString | dict[key]
PyDict_SetItemString | dict[key]=val
PyDict_New | dict = {}
PyObject_GetAttrString | getattr(obj, attr)
PyObject_SetAttrString | setattr(obj, attr, val)
PyEval_CallObject | apply(funcobj, argstuple)
PyRun_String | eval(exprstr), exec stmtstr
PyRun_File | execfile(filename)
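To make the right-hand column of the table concrete, the following sketch walks through the Python-level equivalents in one place. It is an illustration only, not one of the book's examples: it uses the standard string module as a convenient target, the attribute name marker is made up for the demonstration, and the call is spelled func(*args), which is equivalent to apply(func, args) and also runs on later Python releases in which apply is no longer available.

```python
import sys

# PyImport_ImportModule ~ __import__: import by name string, get the module back
mod = __import__("string")

# PyImport_GetModuleDict ~ sys.modules: the table of already-loaded modules
same_module = sys.modules["string"] is mod

# PyModule_GetDict ~ module.__dict__: the module's attribute namespace
namespace = mod.__dict__

# PyDict_GetItemString and PyObject_GetAttrString reach the same object
digits_by_key = namespace["digits"]
digits_by_attr = getattr(mod, "digits")

# PyObject_SetAttrString ~ setattr ('marker' is a made-up name for this sketch)
setattr(mod, "marker", 42)

# PyEval_CallObject ~ apply(func, args), spelled here as func(*args)
func = getattr(mod, "capwords")
args = ("the meaning of life",)
result = func(*args)

# PyRun_String with Py_eval_input ~ eval(exprstr) in an explicit namespace
total = eval("marker + 1", namespace)

print(result)    # The Meaning Of Life
print(total)     # 43
```

Each commented step corresponds to one row of the table; in C, the same sequence would be written with the API calls named in the comments.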
Because embedding relies on API call selection, though, becoming familiar with the Python C API is fundamental to the embedding task. This chapter presents a handful of representative embedding examples and discusses common API calls, but does not provide a comprehensive list of all tools in the API. Once you've mastered the examples here, you'll probably need to consult Python's integration manuals for more details on available calls in this domain. The most recent Python release comes with two standard manuals for C/C++ integration programmers: Extending and Embedding, an integration tutorial; and Python/C API, the Python runtime library reference.
You can find these manuals on the book's CD (view CD-ROM content online at http://examples.oreilly.com/python2), or fetch their most recent releases at http://www.python.org. Beyond this chapter, these manuals are likely to be your best resource for up-to-date and complete Python API tool information.
Before we jump into details, let's get a handle on some of the core ideas in the embedding domain. When this book speaks of "embedded" Python code, it simply means any Python program structure that can be executed from C. Generally speaking, embedded Python code can take a variety of forms:
C programs can represent Python programs as character strings, and run them as either expressions or statements (like eval and exec).
C programs can load or reference Python callable objects such as functions, methods, and classes, and call them with argument lists (like apply).
C programs can execute entire Python program files by importing modules and running script files through the API or general system calls (e.g., popen).
The Python binary library is usually what is physically embedded in the C program; the actual Python code run from C can come from a wide variety of sources:
Code strings might be loaded from files, fetched from persistent databases and shelves, parsed out of HTML or XML files, read over sockets, built or hardcoded in a C program, passed to C extension functions from Python registration code, and so on.
Callable objects might be fetched from Python modules, returned from other Python API calls, passed to C extension functions from Python registration code, and so on.
Code files simply exist as files, modules, and executable scripts.
Registration is a technique commonly used in callback scenarios that we will explore in more detail later in this chapter. But especially for strings of code, there are as many possible sources as there are for C character strings. For example, C programs can construct arbitrary Python code dynamically by building and running strings.
Finally, once you have some Python code to run, you need a way to communicate with it: the Python code may need to use inputs passed in from the C layer, and may want to generate outputs to communicate results back to C. In fact, embedding generally becomes interesting only when the embedded code has access to the enclosing C layer. Usually, the form of the embedded code suggests its communication mediums:
Code strings that are Python expressions return an expression result as their output. Both inputs and outputs can take the form of global variables in the namespace in which a code string is run -- C may set variables to serve as input, run Python code, and fetch variables as the code's result. Inputs and outputs can also be passed with exported C extension calls -- Python code may use C modules or types to get or set variables in the enclosing C layer. Communication schemes are often combined; for instance, C may preassign global names to objects that export state and interface calls to the embedded Python code.[1]
Callable objects may accept inputs as function arguments and produce results as function return values. Passed-in mutable arguments (e.g., lists, dictionaries, class instances) can be used as both input and output for the embedded code -- changes made in Python are retained in objects held by C. Objects can also make use of the global variable and C extension interface techniques described for strings to communicate with C.
Code files can communicate with most of the same techniques as code strings; when run as separate programs, files can also employ IPC techniques.
Naturally, all embedded code forms can also communicate with C using general system-level tools: files, sockets, pipes, and so on. These techniques are generally less direct and slower, though.
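The communication modes just described can be previewed in pure Python before any C is involved. In the sketch below, ordinary Python code plays the role of the enclosing layer; the names count, total, handler, record, and shared are all made up for illustration, and exec is written in its function-call spelling so the code also runs on later Python releases.

```python
# 1) Code strings: globals in the run namespace act as inputs and outputs
namespace = {"count": 3}                   # input preset by the enclosing layer
exec("total = count * 2", namespace)       # embedded statement string
string_output = namespace["total"]         # output fetched after the run

# An expression string hands its result straight back
expr_output = eval("count + 1", namespace)

# 2) Callable objects: arguments go in, a return value comes out
def handler(label, count):
    return "%s:%d" % (label, count)

call_output = handler("spam", 3)

# 3) Mutable arguments: changes made by the callee are seen by the caller
def record(log):
    log.append("handled")

shared = []
record(shared)

print(string_output)    # 6
print(expr_output)      # 4
print(call_output)      # spam:3
print(shared)           # ['handled']
```

When C is the enclosing layer, the same three patterns apply; only the tools used to set, call, and fetch change.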
As you can probably tell from the preceding overview, there is much flexibility in the embedding domain. To illustrate common embedding techniques in action, this section presents a handful of short C programs that run Python code in one form or another. Most of these examples make use of the simple Python module file shown in Example 20-1.
#########################################################
# C runs Python code in this module in embedded mode.
# Such a file can be changed without changing the C layer.
# There is just standard Python code (C does conversions).
# You can also run code in standard modules like string.
#########################################################

import string

message = 'The meaning of life...'

def transform(input):
    input = string.replace(input, 'life', 'Python')
    return string.upper(input)
If you know any Python at all, you know that this file defines a string and a function; the function returns whatever it is passed with string substitution and upper-case conversions applied. It's easy to use from Python:
[mark@toy ~/.../PP2E/Integrate/Embed/Basics]$ python
>>> import usermod                                  # import a module
>>> usermod.message                                 # fetch a string
'The meaning of life...'
>>> usermod.transform(usermod.message)              # call a function
'THE MEANING OF PYTHON...'
With proper API use, it's not much more difficult to use this module the same way in C.
Perhaps the simplest way to run Python code from C is by calling the PyRun_SimpleString API function. With it, C programs can execute Python programs represented as C character string arrays. This call is also very limited: all code runs in the same namespace (module __main__), the code strings must be Python statements (not expressions), and there is no easy way to communicate inputs or outputs with the Python code run. Still, it's a simple place to start; the C program in Example 20-2 runs Python code to accomplish the same results as the interactive session listed in the prior section.
/*******************************************************
 * simple code strings: C acts like the interactive
 * prompt, code runs in __main__, no output sent to C;
 *******************************************************/

#include <Python.h>    /* standard API def */

main( ) {
    printf("embed-simple\n");
    Py_Initialize( );
    PyRun_SimpleString("import usermod");                /* load .py file */
    PyRun_SimpleString("print usermod.message");         /* on python path */
    PyRun_SimpleString("x = usermod.message");           /* compile and run */
    PyRun_SimpleString("print usermod.transform(x)");
}
The first thing you should notice here is that when Python is embedded, C programs always call Py_Initialize to initialize linked-in Python libraries before using any other API functions. The rest of this code is straightforward -- C submits hardcoded strings to Python that are roughly what we typed interactively. Internally, PyRun_SimpleString invokes the Python compiler and interpreter to run the strings sent from C; as usual, the Python compiler is always available in systems that contain Python.
To build a standalone executable from this C source file, you need to link its compiled form with the Python library file. In this chapter, "library" usually means the binary library file (e.g., an .a file on Unix) that is generated when Python is compiled, not the Python source code library.
Today, everything about Python you need in C is compiled into a single .a library file when the interpreter is built. The program's main function comes from your C code, and depending on the extensions installed in your Python, you may also need to link any external libraries referenced by the Python library.
Assuming no extra extension libraries are needed, Example 20-3 is a minimal Linux makefile for building the C program in Example 20-2. Again, makefile details vary per platform, but see Python manuals for hints. This makefile uses the Python include-files path to find Python.h in the compile step, and adds the Python library file to the final link step to make API calls available to the C program.
# a linux makefile that builds a C executable that embeds
# Python, assuming no external module libs must be linked in;
# uses Python header files, links in the Python lib file;
# both may be in other dirs (e.g., /usr) in your install;
# set MYPY to your Python install tree, change lib version;

PY    = $(MYPY)
PYLIB = $(PY)/libpython1.5.a
PYINC = -I$(PY)/Include -I$(PY)

embed-simple: embed-simple.o
	gcc embed-simple.o $(PYLIB) -g -export-dynamic -lm -ldl -o embed-simple

embed-simple.o: embed-simple.c
	gcc embed-simple.c -c -g $(PYINC)
Things may not be quite this simple in practice, though, at least not without some coaxing. The makefile in Example 20-4 is the one I actually used to build all of this section's C programs on Linux.
# build all 5 basic embedding examples
# with external module libs linked in;
# source setup-pp-embed.csh if needed

PY    = $(MYPY)
PYLIB = $(PY)/libpython1.5.a
PYINC = -I$(PY)/Include -I$(PY)

LIBS = -L/usr/lib \
       -L/usr/X11R6/lib \
       -lgdbm -ltk8.0 -ltcl8.0 -lX11 -lm -ldl

BASICS = embed-simple embed-string embed-object embed-dict embed-bytecode

all: $(BASICS)

embed%: embed%.o
	gcc embed$*.o $(PYLIB) $(LIBS) -g -export-dynamic -o embed$*

embed%.o: embed%.c
	gcc embed$*.c -c -g $(PYINC)

clean:
	rm -f *.o *.pyc $(BASICS) core
This version links in Tkinter libraries because the Python library file it uses was built with Tkinter enabled. You may have to link in arbitrarily many more externals for your Python library, and frankly, chasing down all the linker dependencies can be tedious. Required libraries may vary per platform and Python install, so there isn't a lot of advice I can offer to make this process simple (this is C, after all).
But if you're going to do much embedding work, you might want to build Python on your machine from its source with all unnecessary extensions disabled in the Modules/Setup file. This produces a Python library with minimal external dependencies, which links much more easily. For example, if your embedded code won't be building GUIs, Tkinter can simply be removed from the library; see the Setup file for details. You can also find a list of external libraries referenced from your Python in the generated makefiles located in the Python source tree. In any event, the good news is that you only need to resolve linker dependencies once.
Once you've gotten the makefile to work, run it to build the C program with Python libraries linked in. Run the resulting C program as usual:[2]
[mark@toy ~/.../PP2E/Integrate/Embed/Basics]$ embed-simple
embed-simple
The meaning of life...
THE MEANING OF PYTHON...
Most of this output is produced by Python print statements sent from C to the linked-in Python library. It's as if C has become an interactive Python programmer.
However, strings of Python code run by C probably would not be hardcoded in a C program file like this. They might instead be loaded from a text file, extracted from HTML or XML files, fetched from a persistent database or socket, and so on. With such external sources, the Python code strings that are run from C could be changed arbitrarily without having to recompile the C program that runs them. They may even be changed onsite, and by end users of a system. To make the most of code strings, though, we need to move on to more flexible API tools.
Example 20-5 uses the following API calls to run code strings that return expression results back to C:
Py_Initialize initializes linked-in Python libraries as before
PyImport_ImportModule imports a Python module, returns pointer to it
PyModule_GetDict fetches a module's attribute dictionary object
PyRun_String runs a string of code in explicit namespaces
PyObject_SetAttrString assigns an object attribute by name string
PyArg_Parse converts a Python return value object to C form
The import calls are used to fetch the namespace of the usermod module listed in Example 20-1 earlier, so that code strings can be run there directly (and will have access to names defined in that module without qualifications). PyImport_ImportModule is like a Python import statement, but the imported module object is returned to C, not assigned to a Python variable name. Because of that, it's probably more similar to the Python __import__ built-in function we used in Example 7-32.
The PyRun_String call is the one that actually runs code here, though. It takes a code string, a parser mode flag, and dictionary object pointers to serve as the global and local namespaces for running the code string. The mode flag can be Py_eval_input to run an expression, or Py_file_input to run a statement; when running an expression, the result of evaluating the expression is returned from this call (it comes back as a PyObject* object pointer). The two namespace dictionary pointer arguments allow you to distinguish global and local scopes, but they are typically passed the same dictionary such that code runs in a single namespace.[3]
/* code-strings with results and namespaces */

#include <Python.h>

main( ) {
    char *cstr;
    PyObject *pstr, *pmod, *pdict;
    printf("embed-string\n");
    Py_Initialize( );

    /* get usermod.message */
    pmod  = PyImport_ImportModule("usermod");
    pdict = PyModule_GetDict(pmod);
    pstr  = PyRun_String("message", Py_eval_input, pdict, pdict);

    /* convert to C */
    PyArg_Parse(pstr, "s", &cstr);
    printf("%s\n", cstr);

    /* assign usermod.X */
    PyObject_SetAttrString(pmod, "X", pstr);

    /* print usermod.transform(X) */
    (void) PyRun_String("print transform(X)", Py_file_input, pdict, pdict);
    Py_DECREF(pmod);
    Py_DECREF(pstr);
}
When compiled and run, this file produces the same result as its predecessor:
[mark@toy ~/.../PP2E/Integrate/Embed/Basics]$ embed-string
embed-string
The meaning of life...
THE MEANING OF PYTHON...
But very different work goes into producing this output. This time, C fetches, converts, and prints the value of the Python module's message attribute directly by running a string expression, and assigns a global variable (X) within the module's namespace to serve as input for a Python print statement string.
Because the string execution call in this version lets you specify namespaces, you can better partition the embedded code your system runs -- each grouping can have a distinct namespace to avoid overwriting other groups' variables. And because this call returns a result, you can better communicate with the embedded code -- expression results are outputs, and assignments to globals in the namespace in which code runs can serve as inputs.
Before we move on, I need to explain two coding issues here. First of all, this program also decrements the reference count on objects passed to it from Python, using the Py_DECREF call introduced in Chapter 19. These calls are not strictly needed here (the objects' space is reclaimed when the program exits anyhow), but demonstrate how embedding interfaces must manage reference counts when Python passes their ownership to C. If this were a function called from a larger system, for instance, you would generally want to decrement the count to allow Python to reclaim the objects.
Secondly, in a realistic program, you should generally test the return values of all the API calls in this program immediately to detect errors (e.g., import failure). Error tests are omitted in this section's examples to keep the code simple, but will appear in later code listings and should be included in your programs to make them more robust.
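The namespace behavior described in this section can also be tried out at the Python level. The sketch below is a rough Python-side model of running code strings in a module's namespace, not a replacement for the C listing: it builds a stand-in module in memory (the name usermod_demo and its transform are assumptions made so the code is self-contained), uses string methods instead of the string module, and spells exec as a function call so that it also runs on later Python releases.

```python
import types

# Stand-in for a user module, built in memory so the sketch runs on its own
mod = types.ModuleType("usermod_demo")       # hypothetical module name
source = (
    "message = 'The meaning of life...'\n"
    "def transform(text):\n"
    "    return text.replace('life', 'Python').upper()\n"
)
exec(source, mod.__dict__)

ns = mod.__dict__

# Expression string: the result comes straight back (compare Py_eval_input)
pstr = eval("message", ns, ns)

# Preset a global as input, then run a statement string (compare Py_file_input)
setattr(mod, "X", pstr)
exec("result = transform(X)", ns, ns)
output = ns["result"]

# A second, separate namespace is untouched by names set in the first
other = {}
exec("X = 'unrelated'", other, other)

print(output)            # THE MEANING OF PYTHON...
print(mod.X == pstr)     # True
```

Because each exec and eval call names its namespace explicitly, the two assignments to X never collide; this is the partitioning effect that explicit namespace arguments provide in C as well.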
The last two sections dealt with running strings of code, but it's easy for C programs to deal in terms of Python objects too. Example 20-6 accomplishes the same task as Examples 20-2 and 20-5, but uses other API tools to interact with objects in the Python module directly:
PyImport_ImportModule imports the module from C as before
PyObject_GetAttrString fetches an object's attribute value by name
PyEval_CallObject calls a Python function (or class, or method)
PyArg_Parse converts Python objects to C values
Py_BuildValue converts C values to Python objects
We met both of the data conversion functions in the last chapter. The PyEval_CallObject call is the key call in this version: it runs the imported function with a tuple of arguments, much like the Python apply built-in function. The Python function's return value comes back to C as a PyObject*, a generic Python object pointer.
/* fetch and call objects in modules */

#include <Python.h>

main( ) {
    char *cstr;
    PyObject *pstr, *pmod, *pfunc, *pargs;
    printf("embed-object\n");
    Py_Initialize( );

    /* get usermod.message */
    pmod = PyImport_ImportModule("usermod");
    pstr = PyObject_GetAttrString(pmod, "message");

    /* convert string to C */
    PyArg_Parse(pstr, "s", &cstr);
    printf("%s\n", cstr);
    Py_DECREF(pstr);

    /* call usermod.transform(usermod.message) */
    pfunc = PyObject_GetAttrString(pmod, "transform");
    pargs = Py_BuildValue("(s)", cstr);
    pstr  = PyEval_CallObject(pfunc, pargs);
    PyArg_Parse(pstr, "s", &cstr);
    printf("%s\n", cstr);

    /* free owned objects */
    Py_DECREF(pmod);
    Py_DECREF(pstr);
    Py_DECREF(pfunc);        /* not really needed in main( ) */
    Py_DECREF(pargs);        /* since all memory goes away   */
}
When compiled and run, the result is the same again:
[mark@toy ~/.../PP2E/Integrate/Embed/Basics]$ embed-object
embed-object
The meaning of life...
THE MEANING OF PYTHON...
But this output is all generated by C this time -- first by fetching the Python module's message attribute value, and then by fetching and calling the module's transform function object directly and printing its return value that is sent back to C. Input to the transform function is a function argument here, not a preset global variable. Notice that message is fetched as a module attribute this time, instead of by running its name as a code string; there is often more than one way to accomplish the same goals with different API calls.
Running functions in modules like this is a simple way to structure embedding; code in the module file can be changed arbitrarily without having to recompile the C program that runs it. It also provides a direct communication model: inputs and outputs to Python code can take the form of function arguments and return values.
When we used PyRun_String earlier to run expressions with results, code was executed in the namespace of an existing Python module. However, sometimes it's more convenient to create a brand new namespace for running code strings that is independent of any existing module files. The C file in Example 20-7 shows how; the new namespace is created as a new Python dictionary object, and a handful of new API calls are employed in the process:
PyDict_New makes a new empty dictionary object
PyDict_SetItemString assigns to a dictionary's key
PyDict_GetItemString fetches (indexes) a dictionary value by key
PyRun_String runs a code string in namespaces, as before
PyEval_GetBuiltins gets the built-in scope's module
The main trick here is the new dictionary. Inputs and outputs for the embedded code strings are mapped to this dictionary by passing it as the code's namespace dictionaries in the PyRun_String call. The net effect is that the C program in Example 20-7 works exactly like this Python code:
>>> d = {}
>>> d['Y'] = 2
>>> exec 'X = 99' in d, d
>>> exec 'X = X + Y' in d, d
>>> print d['X']
101
But here, each Python operation is replaced by a C API call.
/***************************************************
 * make a new dictionary for code string namespace;
 ***************************************************/

#include <Python.h>

main( ) {
    int cval;
    PyObject *pdict, *pval;
    printf("embed-dict\n");
    Py_Initialize( );

    /* make a new namespace */
    pdict = PyDict_New( );
    PyDict_SetItemString(pdict, "__builtins__", PyEval_GetBuiltins( ));

    PyDict_SetItemString(pdict, "Y", PyInt_FromLong(2));   /* dict['Y'] = 2   */
    PyRun_String("X = 99",  Py_file_input, pdict, pdict);  /* run statements  */
    PyRun_String("X = X+Y", Py_file_input, pdict, pdict);  /* same X and Y    */
    pval = PyDict_GetItemString(pdict, "X");               /* fetch dict['X'] */

    PyArg_Parse(pval, "i", &cval);                         /* convert to C */
    printf("%d\n", cval);                                  /* result=101   */
    Py_DECREF(pdict);
}
When compiled and run, this C program creates this sort of output:
[mark@toy ~/.../PP2E/Integrate/Embed/Basics]$ embed-dict
embed-dict
101
The output is different this time: it reflects the value of Python variable X assigned by the embedded Python code strings and fetched by C. In general, C can fetch module attributes either by calling PyObject_GetAttrString with the module, or by using PyDict_GetItemString to index the module's attribute dictionary (expression strings work too, but are less direct). Here, there is no module at all, so dictionary indexing is used to access the code's namespace in C.
Besides allowing you to partition code string namespaces independent of any Python module files on the underlying system, this scheme provides a natural communication mechanism. Values stored in the new dictionary before code is run serve as inputs, and names assigned by the embedded code can later be fetched out of the dictionary to serve as code outputs. For instance, the variable Y in the second string run refers to a name set to 2 by C; X is assigned by the Python code and fetched later by C code as the printed result.
There is one trick in this code that I need to explain. Each module namespace in Python has a link to the built-in scope's namespace, where names like open and len live. In fact, this is the link Python follows during the last step of its local/global/built-in three-scope name lookup procedure.[4] Today, embedding code is responsible for setting the __builtins__ scope link in dictionaries that serve as namespaces. Python sets this link automatically in all other namespaces that host code execution, and this embedding requirement may be lifted in the future (it seems a bit too magical to be required for long). For now, simply do what this example does to initialize the built-ins link, in dictionaries you create for running code in C.
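The built-ins link is easy to observe from Python itself, where exec installs it automatically in a dictionary that lacks one. The following sketch is only a Python-level illustration of that link, using the function-call spelling of exec so it also runs on later Python releases; the variable names are made up for the demonstration. It does not show what the C API does for you.

```python
d = {}
had_link_before = "__builtins__" in d     # a brand new dictionary has no link

d["Y"] = 2
exec("X = 99", d, d)
has_link_after = "__builtins__" in d      # exec added the built-ins link

exec("X = X + Y", d, d)
exec("size = len('spam')", d, d)          # built-in names such as len resolve

print(had_link_before)    # False
print(has_link_after)     # True
print(d["X"])             # 101
print(d["size"])          # 4
```

Without that link, the reference to len in the last code string would have nowhere to be resolved, which is why the C example stores the link explicitly before running code.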
When you call Python function objects from C, you are actually running the already-compiled bytecode associated with the object (e.g., a function body). When running strings, Python must compile the string before running it. Because compilation is a slow process, this can be a substantial overhead if you run a code string more than once. Instead, precompile the string to a bytecode object to be run later, using the API calls illustrated in Example 20-8:[5]
Py_CompileString compiles a string of code, returns a bytecode object
PyEval_EvalCode runs a compiled bytecode object
The first of these takes the mode flag normally passed to PyRun_String, and a second string argument that is only used in error messages. The second takes two namespace dictionaries. These two API calls are used in Example 20-8 to compile and execute three strings of Python code.
/* precompile code strings to bytecode objects */

#include <Python.h>
#include <compile.h>
#include <eval.h>

main( ) {
    int i;
    char *cval;
    PyObject *pcode1, *pcode2, *pcode3, *presult, *pdict;
    char *codestr1, *codestr2, *codestr3;
    printf("embed-bytecode\n");

    Py_Initialize( );
    codestr1 = "import usermod\nprint usermod.message";     /* statements  */
    codestr2 = "usermod.transform(usermod.message)";        /* expression  */
    codestr3 = "print '%d:%d' % (X, X ** 2),";              /* use input X */

    /* make new namespace dictionary */
    pdict = PyDict_New( );
    if (pdict == NULL) return -1;
    PyDict_SetItemString(pdict, "__builtins__", PyEval_GetBuiltins( ));

    /* precompile strings of code to bytecode objects */
    pcode1 = Py_CompileString(codestr1, "<embed>", Py_file_input);
    pcode2 = Py_CompileString(codestr2, "<embed>", Py_eval_input);
    pcode3 = Py_CompileString(codestr3, "<embed>", Py_file_input);

    /* run compiled bytecode in namespace dict */
    if (pcode1 && pcode2 && pcode3) {
        (void)    PyEval_EvalCode((PyCodeObject *)pcode1, pdict, pdict);
        presult = PyEval_EvalCode((PyCodeObject *)pcode2, pdict, pdict);
        PyArg_Parse(presult, "s", &cval);
        printf("%s\n", cval);
        Py_DECREF(presult);

        /* rerun code object repeatedly */
        for (i = 0; i <= 10; i++) {
            PyDict_SetItemString(pdict, "X", PyInt_FromLong(i));
            (void) PyEval_EvalCode((PyCodeObject *)pcode3, pdict, pdict);
        }
        printf("\n");
    }

    /* free referenced objects */
    Py_XDECREF(pdict);
    Py_XDECREF(pcode1);
    Py_XDECREF(pcode2);
    Py_XDECREF(pcode3);
}
This program combines a variety of techniques we've already seen. The namespace in which the compiled code strings run, for instance, is a newly created dictionary (not an existing module object), and inputs for code strings are passed as preset variables in the namespace. When built and executed, the first part of the output is similar to previous examples in this section, but the last line represents running the same precompiled code string 11 times:
[mark@toy ~/.../PP2E/Integrate/Embed/Basics]$ embed-bytecode
embed-bytecode
The meaning of life...
THE MEANING OF PYTHON...
0:0 1:1 2:4 3:9 4:16 5:25 6:36 7:49 8:64 9:81 10:100
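If your system executes the same code strings multiple times, precompiling them to bytecode in this fashion can be a major speedup, because the cost of compilation is paid only once per string rather than once per run.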
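The same compile-once, run-many pattern is available to Python code through the compile built-in, which makes it easy to experiment with before coding it in C. The sketch below is a Python-level analogue only; the names expr_code, stmt_code, and results are made up, and exec is spelled as a function call so the code also runs on later Python releases.

```python
# Compile each string once; the second argument is only used in error messages
expr_code = compile("X ** 2", "<embed>", "eval")                  # expression
stmt_code = compile("results.append('%d:%d' % (X, X ** 2))",
                    "<embed>", "exec")                            # statement

ns = {"results": []}          # new namespace; inputs are preset variables
squares = []
for i in range(11):
    ns["X"] = i                                   # input for this run
    squares.append(eval(expr_code, ns, ns))       # rerun compiled expression
    exec(stmt_code, ns, ns)                       # rerun compiled statement

print(squares)
print(" ".join(ns["results"]))
# 0:0 1:1 2:4 3:9 4:16 5:25 6:36 7:49 8:64 9:81 10:100
```

The mode strings "eval" and "exec" play the part of the Py_eval_input and Py_file_input flags, and the code objects returned by compile are rerun here just as bytecode objects are rerun with PyEval_EvalCode.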
In examples thus far, C has been running and calling Python code from a standard main program flow of control. That's not always the way programs work, though; in some cases, programs are modeled on an event-driven architecture where code is executed only in response to some sort of event. The event might be an end user clicking a button in a GUI, the operating system delivering a signal, or simply software running an action associated with an entry in a table.
In any event (pun accidental), program code in such an architecture is typically structured as callback handlers -- chunks of code dispatched by event-processing logic. It's easy to use embedded Python code to implement callback handlers in such a system; in fact, the event-processing layer can simply use the embedded-call API tools we saw earlier in this chapter to run Python handlers.
The only new trick in this model is how to make the C layer know what code should be run for each event. Handlers must somehow be registered to C to associate them with future events. In general, there is a wide variety of ways to achieve this code/event association; for instance, C programs can:
Fetch and call functions by event name from one or more module files
Fetch and run code strings associated with event names in a database
Extract and run code associated with event tags in HTML or XML[6]
Run Python code that calls back to C to tell it what should be run
And so on. Really, any place you can associate objects or strings with identifiers is a potential callback registration mechanism. Some of these techniques have advantages all their own. For instance, callbacks fetched from module files support dynamic reloading (as we learned in Chapter 9, reload works on modules and does not update objects held directly). And none of the first three schemes requires users to code special Python programs that do nothing but register handlers to be run later.
It is perhaps more common, though, to register callback handlers with the last approach: letting Python code register handlers with C by calling back to C through extensions interfaces. Although this scheme is not without trade-offs, it can provide a natural and direct model in scenarios where callbacks are associated with a large number of objects.
For instance, consider a GUI constructed by building a tree of widget objects in Python scripts. If each widget object in the tree can have an associated event handler, it may be easier to register handlers by simply calling methods of widgets in the tree. Associating handlers with widget objects in a separate structure such as a module file or HTML file requires extra cross-reference work to keep the handlers in sync with the tree.[7]
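Explicit registration is simple enough to model in a few lines of pure Python. The sketch below is a hypothetical stand-in for an event-processing layer, with the function names set_handler, route_event, and trigger_event and the handler on_event all invented for illustration; it shows only the control flow of registering a callable and dispatching events to it later.

```python
_handler = None          # the event layer's saved reference to the handler
_count = 0               # running event counter

def set_handler(func):
    """Register a callable to be run on later events."""
    global _handler
    _handler = func

def route_event(label, count):
    """Dispatch one event to the registered handler, if there is one."""
    if _handler is None:
        return None
    return _handler(label, count)

def trigger_event():
    """Simulate an event, passing a running counter to the handler."""
    global _count
    result = route_event("spam", _count)
    _count += 1
    return result

# Client code registers its handler once, then events call back into it
def on_event(label, count):
    return "%s number %d" % (label, count)

set_handler(on_event)
first = trigger_event()
second = trigger_event()
print(first)     # spam number 0
print(second)    # spam number 1
```

When the event layer is written in C, the saved handler is a Python object pointer held by C, so the layer must also manage its reference count for as long as the handler is registered.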
The following C and Python files demonstrate the basic coding techniques used to implement explicitly registered callback handlers. The C file in Example 20-9 implements interfaces for registering Python handlers, as well as code to run those handlers in response to events:
The Route_Event function responds to an event by calling a Python function object previously passed from Python to C.
The Register_Handler function saves a passed-in Python function object pointer in a C global variable. Python calls Register_Handler through a simple cregister C extension module created by this file.
To simulate real-world events, the Trigger_Event function can be called from Python through the generated C module to trigger an event.
In other words, this example uses both the embedding and extending interfaces we've already met to register and invoke Python event handler code.
#include <Python.h>
#include <stdlib.h>

/***********************************************/
/* 1) code to route events to Python object    */
/* note that we could run strings here instead */
/***********************************************/

static PyObject *Handler = NULL;     /* keep Python object in C */

void Route_Event(char *label, int count)
{
    char *cres;
    PyObject *args, *pres;

    /* call Python handler */
    args = Py_BuildValue("(si)", label, count);   /* make arg-list */
    pres = PyEval_CallObject(Handler, args);      /* apply: run a call */
    Py_DECREF(args);                              /* add error checks */

    if (pres != NULL) {
        /* use and decref handler result */
        PyArg_Parse(pres, "s", &cres);
        printf("%s\n", cres);
        Py_DECREF(pres);
    }
}

/*****************************************************/
/* 2) python extension module to register handlers   */
/* python imports this module to set handler objects */
/*****************************************************/

static PyObject *
Register_Handler(PyObject *self, PyObject *args)
{
    /* save Python callable object */
    Py_XDECREF(Handler);                 /* called before? */
    PyArg_Parse(args, "O", &Handler);    /* one argument? */
    Py_XINCREF(Handler);                 /* add a reference */
    Py_INCREF(Py_None);                  /* return 'None': success */
    return Py_None;
}

static PyObject *
Trigger_Event(PyObject *self, PyObject *args)
{
    /* let Python simulate event caught by C */
    static int count = 0;                /* event counter, kept across calls */
    Route_Event("spam", count++);
    Py_INCREF(Py_None);
    return Py_None;
}

static struct PyMethodDef cregister_methods[] = {
    {"setHandler",    Register_Handler},       /* name, address */
    {"triggerEvent",  Trigger_Event},
    {NULL, NULL}
};

void initcregister( )                  /* this is called by Python  */
{                                      /* on first "import cregister" */
    (void) Py_InitModule("cregister", cregister_methods);
}
Ultimately, this C file is an extension module for Python, not a standalone C program that embeds Python (though C could just as well be on top). To compile it into a dynamically loaded module file, run the makefile in Example 20-10 on Linux (and use something similar on other platforms). As we learned in the last chapter, the resulting cregister.so file will be loaded when first imported by a Python script if it is placed in a directory on Python's module search path (e.g., ".").
######################################################################
# Builds cregister.so, a dynamically-loaded C extension
# module (shareable), which is imported by register.py
######################################################################

PY    = $(MYPY)
PYINC = -I$(PY)/Include -I$(PY)

CMODS = cregister.so

all: $(CMODS)

cregister.so: cregister.c
	gcc cregister.c -g $(PYINC) -fpic -shared -o cregister.so

clean:
	rm -f *.pyc $(CMODS)
Now that we have a C extension module set to register and dispatch Python handlers, all we need are some Python handlers. The Python module shown in Example 20-11 defines two callback handler functions and imports the C extension module to register handlers and trigger events.
#######################################################
# register for and handle event callbacks from C;
# compile C code, and run with 'python register.py'
#######################################################

#
# C calls these Python functions;
# handle an event, return a result
#

def callback1(label, count):
    return 'callback1 => %s number %i' % (label, count)

def callback2(label, count):
    return 'callback2 => ' + label * count

#
# Python calls a C extension module
# to register handlers, trigger events
#

import cregister

print '\nTest1:'
cregister.setHandler(callback1)
for i in range(3):
    cregister.triggerEvent( )         # simulate events caught by C layer

print '\nTest2:'
cregister.setHandler(callback2)
for i in range(3):
    cregister.triggerEvent( )         # routes these events to callback2
That's it -- the Python/C callback integration is set to go. To kick off the system, run the Python script; it registers one handler function, forces three events to be triggered, and then changes the event handler and does it again:
[mark@toy ~/.../PP2E/Integration/Mixed/Regist]$ python register.py

Test1:
callback1 => spam number 0
callback1 => spam number 1
callback1 => spam number 2

Test2:
callback2 => spamspamspam
callback2 => spamspamspamspam
callback2 => spamspamspamspamspam
This output is printed by the C event router function, but its content is the return values of the handler functions in the Python module. Actually, there is something pretty wild going on here under the hood. When Python forces an event to trigger, control flows between languages like this:
From Python to the C event router function
From the C event router function to the Python handler function
Back to the C event router function (where the output is printed)
And finally back to the Python script
That is, we jump from Python to C to Python and back again. Along the way, control passes through both extending and embedding interfaces. When the Python callback handler is running, there are two Python levels active, and one C level in the middle. Luckily, this works; Python's API is reentrant, so you don't need to be concerned about having multiple Python interpreter levels active at the same time. Each level runs different code and operates independently.
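The same Python-to-C-to-Python round trip can be observed without compiling anything by letting the standard ctypes module hand a Python function to a C library routine. The sketch below is not the cregister example: it assumes a POSIX system on which ctypes.CDLL(None) exposes the C library's qsort, and it passes the callback directly as an argument instead of registering it ahead of time. The trace list is only there to show that C really did call back into Python.

```python
import ctypes

# Assumption: POSIX platform, where CDLL(None) exposes the C library
libc = ctypes.CDLL(None)

# C signature of the callback: int (*)(const int *, const int *)
CMPFUNC = ctypes.CFUNCTYPE(ctypes.c_int,
                           ctypes.POINTER(ctypes.c_int),
                           ctypes.POINTER(ctypes.c_int))

libc.qsort.restype = None
libc.qsort.argtypes = [ctypes.c_void_p, ctypes.c_size_t,
                       ctypes.c_size_t, CMPFUNC]

trace = []                                 # records each call made from C

def compare(a, b):
    # Runs in Python while the C qsort call is still active
    trace.append((a[0], b[0]))
    return (a[0] > b[0]) - (a[0] < b[0])

values = (ctypes.c_int * 5)(5, 1, 7, 33, 99)

# Python -> C (qsort) -> Python (compare) -> C -> back to Python
libc.qsort(values, len(values), ctypes.sizeof(ctypes.c_int), CMPFUNC(compare))

print(list(values))                        # [1, 5, 7, 33, 99]
print("callbacks made by C: %d" % len(trace))
```

While compare is running, there are two Python levels active with a C level between them, which is the same nesting described above.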
In the previous chapter, we saw how to use C++ classes in Python by wrapping them with SWIG. But what about going the other way -- using Python classes from other languages? It turns out that this is really just a matter of applying interfaces already shown.
Recall that Python scripts generate class instance objects by calling class objects as though they were functions. To do it from C (or C++), you simply follow the same steps: import a class from a module (or elsewhere), build an arguments tuple, and call it to generate an instance using the same C API tools you use to call Python functions. Once you've got an instance, you can fetch attributes and methods with the same tools you use to fetch globals out of a module.
To illustrate how this works in practice, Example 20-12 defines a simple Python class in a module that we can utilize from C.
# call this class from C to make objects

class klass:
    def method(self, x, y):
        return "brave %s %s" % (x, y)      # run me from C
This is nearly as simple as it gets, but it's enough to illustrate the basics. As usual, make sure that this module is on your Python search path (e.g., in the current directory, or one listed on your PYTHONPATH setting), or else the import call to access it from C will fail, just as it would in a Python script. Now, here is how you might make use of this Python class from a Python program:
C:\...\PP2E\Integrate\Embed\ApiClients>python
>>> import module                              # import the file
>>> object = module.klass( )                   # make class instance
>>> result = object.method('sir', 'robin')     # call class method
>>> print result
brave sir robin
This is fairly easy stuff in Python. You can do all these operations in C too, but it takes a bit more code. The C file in Example 20-13 implements these steps by arranging calls to the appropriate Python API tools.
#include <Python.h>
#include <stdio.h>

main( ) {
  /* run objects with low-level calls */
  char *arg1="sir", *arg2="robin", *cstr;
  PyObject *pmod, *pclass, *pargs, *pinst, *pmeth, *pres;

  /* instance = module.klass( ) */
  Py_Initialize( );
  pmod   = PyImport_ImportModule("module");         /* fetch module */
  pclass = PyObject_GetAttrString(pmod, "klass");   /* fetch module.class */
  Py_DECREF(pmod);

  pargs  = Py_BuildValue("( )");
  pinst  = PyEval_CallObject(pclass, pargs);        /* call class( ) */
  Py_DECREF(pclass);
  Py_DECREF(pargs);

  /* result = instance.method(x,y) */
  pmeth  = PyObject_GetAttrString(pinst, "method"); /* fetch bound method */
  Py_DECREF(pinst);
  pargs  = Py_BuildValue("(ss)", arg1, arg2);       /* convert to Python */
  pres   = PyEval_CallObject(pmeth, pargs);         /* call method(x,y) */
  Py_DECREF(pmeth);
  Py_DECREF(pargs);

  PyArg_Parse(pres, "s", &cstr);                    /* convert to C */
  printf("%s\n", cstr);
  Py_DECREF(pres);
}
Step through this source file for more details; it's merely a matter of figuring out how you would accomplish the task in Python, and then calling equivalent C functions in the Python API. To build this source into a C executable program, run the makefile in the file's directory (it's analogous to makefiles we've already seen). After compiling, run it as you would any other C program:
[mark@toy ~/.../PP2E/Integrate/Embed/ApiClients]$ objects-low
brave sir robin
This output might seem anticlimactic, but it actually reflects the return values sent back to C by the class method in file module.py. C did a lot of work to get this little string: it imported the module, fetched the class, made an instance, and fetched and called the instance method, performing data conversions and reference count management every step of the way. In return for all the work, C gets to use the techniques shown in this file to reuse any Python class.
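To make the correspondence concrete, here is a hedged Python rendering of the same steps, with each line annotated with the C API call it mirrors. The variable names are borrowed from the C listing; the in-memory module is an assumption made only so the sketch runs without a module.py file on disk.

```python
import sys
import types
import importlib

# Stand-in for module.py, so that the sketch is self-contained
mod = types.ModuleType("module")
exec("class klass:\n"
     "    def method(self, x, y):\n"
     "        return 'brave %s %s' % (x, y)\n", mod.__dict__)
sys.modules["module"] = mod

# instance = module.klass()
pmod   = importlib.import_module("module")   # PyImport_ImportModule
pclass = getattr(pmod, "klass")              # PyObject_GetAttrString
pargs  = ()                                  # Py_BuildValue("()")
pinst  = pclass(*pargs)                      # PyEval_CallObject

# result = instance.method(x, y)
pmeth  = getattr(pinst, "method")            # PyObject_GetAttrString
pargs  = ("sir", "robin")                    # Py_BuildValue("(ss)", ...)
pres   = pmeth(*pargs)                       # PyEval_CallObject

print(pres)                                  # brave sir robin
```

What the Python version leaves out is exactly what C must add by hand: reference count management after each step and conversion of the final result to a C string.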
Of course, this example would be more complex in practice. As mentioned earlier, you generally need to check the return value of every Python API call to make sure it didn't fail. The module import call in this C code, for instance, can fail easily if the module isn't on the search path; if you don't trap the NULL pointer result, your program will almost certainly crash when it tries to use the pointer (at least eventually). Example 20-14 is a recoding of Example 20-13 with full error-checking; it's big, but it's robust.
#include <Python.h>
#include <stdio.h>

/* print a message and quit; do/while (0) makes the macro act as one statement */
#define error(msg) do { printf("%s\n", msg); exit(1); } while (0)

main( ) {
  /* run objects with low-level calls and full error checking */
  char *arg1="sir", *arg2="robin", *cstr;
  PyObject *pmod, *pclass, *pargs, *pinst, *pmeth, *pres;

  /* instance = module.klass( ) */
  Py_Initialize( );
  pmod = PyImport_ImportModule("module");           /* fetch module */
  if (pmod == NULL)
      error("Can't load module");

  pclass = PyObject_GetAttrString(pmod, "klass");   /* fetch module.class */
  Py_DECREF(pmod);
  if (pclass == NULL)
      error("Can't get module.klass");

  pargs = Py_BuildValue("( )");
  if (pargs == NULL) {
      Py_DECREF(pclass);
      error("Can't build arguments list");
  }
  pinst = PyEval_CallObject(pclass, pargs);         /* call class( ) */
  Py_DECREF(pclass);
  Py_DECREF(pargs);
  if (pinst == NULL)
      error("Error calling module.klass( )");

  /* result = instance.method(x,y) */
  pmeth = PyObject_GetAttrString(pinst, "method");  /* fetch bound method */
  Py_DECREF(pinst);
  if (pmeth == NULL)
      error("Can't fetch klass.method");

  pargs = Py_BuildValue("(ss)", arg1, arg2);        /* convert to Python */
  if (pargs == NULL) {
      Py_DECREF(pmeth);
      error("Can't build arguments list");
  }
  pres = PyEval_CallObject(pmeth, pargs);           /* call method(x,y) */
  Py_DECREF(pmeth);
  Py_DECREF(pargs);
  if (pres == NULL)
      error("Error calling klass.method");

  if (!PyArg_Parse(pres, "s", &cstr))               /* convert to C */
      error("Can't convert klass.method result");
  printf("%s\n", cstr);
  Py_DECREF(pres);
}
But don't do that. As you can probably tell from the last example, embedded-mode integration code can very quickly become as complicated as extending code for nontrivial use. Today, no automation solution solves the embedding problem as well as SWIG addresses extending. Because embedding does not impose the kind of structure that extension modules and types provide, it's much more of an open-ended problem; what automates one embedding strategy might be completely useless in another.
With a little up-front work, though, you can still automate common embedding tasks by wrapping up calls in a higher-level API. These APIs could handle things such as error detection, reference counts, data conversions, and so on. One such API, ppembed, is available on this book's CD (see http://examples.oreilly.com/python2). It merely combines existing tools in Python's standard C API to provide a set of easier-to-use calls for running Python programs from C.
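The wrapping idea itself is language-neutral, and a small Python model may make it easier to see before reading the C version. The run_function and run_method names below are hypothetical; they only imitate the general shape of such an API by folding the import, attribute fetch, call, and error detection into one call that reports failure through a status code, much as a C wrapper would return -1.

```python
import importlib

def run_function(modname, funcname, *args):
    """Hypothetical wrapper: import, fetch, and call in one step.

    Returns (0, result) on success or (-1, None) on any failure,
    so callers test one status instead of checking every step.
    """
    try:
        module = importlib.import_module(modname)
        func = getattr(module, funcname)
        return 0, func(*args)
    except Exception:
        return -1, None

def run_method(obj, methname, *args):
    """Hypothetical wrapper: fetch a bound method and call it."""
    try:
        return 0, getattr(obj, methname)(*args)
    except Exception:
        return -1, None

print(run_function("math", "sqrt", 16.0))        # (0, 4.0)
print(run_function("no_such_module_xyz", "f"))   # (-1, None)
print(run_method("spam", "upper"))               # (0, 'SPAM')
```

The trade-off is the usual one for convenience layers: the caller loses detail about which step failed unless the wrapper also saves the exception information somewhere.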
Example 20-15 demonstrates how to recode objects-err-low.c by linking ppembed's library files with your program.
#include <stdio.h>
#include "ppembed.h"

main ( ) {                                   /* with ppembed high-level api */
  int failflag;
  PyObject *pinst = NULL;                    /* NULL: safe to release on failure */
  char *arg1="sir", *arg2="robin", *cstr = NULL;

  failflag = PP_Run_Function("module", "klass", "O", &pinst, "( )") ||
             PP_Run_Method(pinst, "method", "s", &cstr, "(ss)", arg1, arg2);

  printf("%s\n", (!failflag) ? cstr : "Can't call objects");
  Py_XDECREF(pinst); free(cstr);             /* both calls accept NULL */
}
This file uses two ppembed calls (the names that start with "PP") to make the class instance and call its method. Because ppembed handles error checks, reference counts, data conversions, and so on, there isn't much else to do here. When this program is run and linked with ppembed library code, it works like the original, but is much easier to read, write, and debug:
[mark@toy ~/.../PP2E/Integrate/Embed/ApiClients]$ objects-api
brave sir robin
The ppembed API provides higher-level calls for most of the embedding techniques we've seen in this chapter. For example, the C program in Example 20-16 runs a code string that calls the string module's upper function to convert a short piece of text to uppercase.
#include <Python.h>          /* standard API defs */

void error(char *msg) { printf("%s\n", msg); exit(1); }

main( ) {
    /* run strings with low-level calls */
    char *cstr;
    PyObject *pstr, *pmod, *pdict;               /* with error tests */
    Py_Initialize( );

    /* result = string.upper('spam') + '!' */
    pmod = PyImport_ImportModule("string");      /* fetch module */
    if (pmod == NULL)                            /* for name-space */
        error("Can't import module");

    pdict = PyModule_GetDict(pmod);              /* string.__dict__ */
    Py_DECREF(pmod);
    if (pdict == NULL)
        error("Can't get module dict");

    pstr = PyRun_String("upper('spam') + '!'", Py_eval_input, pdict, pdict);
    if (pstr == NULL)
        error("Error while running string");

    /* convert result to C */
    if (!PyArg_Parse(pstr, "s", &cstr))
        error("Bad result type");
    printf("%s\n", cstr);
    Py_DECREF(pstr);        /* free exported objects, not pdict */
}
This C program file includes politically correct error tests after each API call. When run, it prints the result returned by running an uppercase conversion call in the namespace of the Python string module:
[mark@toy ~/.../PP2E/Integrate/Embed/ApiClients]$ codestring-low
SPAM!
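Running a code string in a module's namespace has a direct Python-level counterpart, which can be a handy way to prototype what the C code will do. In the sketch below, eval plays the role of a run with Py_eval_input and exec plays the role of a statement-mode run. The math module is substituted for string purely as an assumption of this sketch, so that the same code runs on both older and current Pythons, and a copy of the module's dictionary is used so the module itself is left untouched.

```python
import math

# Copy of math.__dict__: names such as sqrt resolve without qualification
namespace = dict(vars(math))

# Expression mode: the value of the expression is the result
result = eval("sqrt(16) + 1", namespace, namespace)
print(result)                               # 5.0

# Statement mode: no result; outputs are names left in the namespace
exec("total = sqrt(9) + floor(2.5)", namespace)
print(namespace["total"])                   # 5.0
```

The distinction between the two modes is the same one C code must make when it chooses how to run a string: expressions return a value, while statements communicate only through the namespace.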
You can implement such integrations by calling Python API functions directly, but you don't necessarily have to. With a higher-level embedding API like ppembed, the task can be noticeably simpler, as shown in Example 20-17.
#include "ppembed.h"
#include <stdio.h>
                                                /* with ppembed high-level api */
main( ) {
    char *cstr;
    int err = PP_Run_Codestr(
                    PP_EXPRESSION,                       /* expr or stmt?  */
                    "upper('spam') + '!'", "string",     /* code, module   */
                    "s", &cstr);                         /* expr result    */
    printf("%s\n", (!err) ? cstr : "Can't run string");  /* and free(cstr) */
}
When linked with the ppembed library code, this version produces the same result as the former. Like most higher-level APIs, ppembed makes some usage mode assumptions that are not universally applicable; when they match the embedding task at hand, though, such wrapper calls can cut much clutter from programs that need to run embedded Python code.
Embedded Python code can do useful work as well. For instance, the C program in Example 20-18 calls ppembed functions to run a string of Python code fetched from a file that performs validation tests on inventory data. To save space, I'm not going to list all the components used by this example (though you can find them at http://examples.oreilly.com/python2). Still, this file shows the embedding portions relevant to this chapter: it sets variables in the Python code's namespace to serve as input, runs the Python code, and then fetches names out of the code's namespace as results.[8]
/* run embedded code-string validations */

#include <ppembed.h>
#include <stdio.h>
#include <string.h>
#include "ordersfile.h"

run_user_validation( )
{                                  /* python is initialized automatically */
    int i, status, nbytes;         /* caveat: should check status everywhere */
    char script[4096];             /* caveat: should malloc a big-enough block */
    char *errors, *warnings;
    FILE *file;

    file = fopen("validate1.py", "r");          /* customizable validations */
    if (file == NULL) {
        printf("Can't open validate1.py\n");
        return -1;
    }
    nbytes = fread(script, 1, sizeof(script) - 1, file);   /* leave room for '\0' */
    fclose(file);
    script[nbytes] = '\0';

    status = PP_Make_Dummy_Module("orders");    /* application's own namespace */
    for (i=0; i < numorders; i++) {             /* like making a new dictionary */
        printf("\n%d (%d, %d, '%s')\n",
            i, orders[i].product, orders[i].quantity, orders[i].buyer);

        PP_Set_Global("orders", "PRODUCT",  "i", orders[i].product);   /* int */
        PP_Set_Global("orders", "QUANTITY", "i", orders[i].quantity);  /* int */
        PP_Set_Global("orders", "BUYER",    "s", orders[i].buyer);     /* str */

        status = PP_Run_Codestr(PP_STATEMENT, script, "orders", "", NULL);
        if (status == -1) {
            printf("Python error during validation.\n");
            PyErr_Print( );  /* show traceback */
            continue;
        }

        PP_Get_Global("orders", "ERRORS",   "s", &errors);     /* can split */
        PP_Get_Global("orders", "WARNINGS", "s", &warnings);   /* on blanks */

        printf("errors:   %s\n", strlen(errors)? errors : "none");
        printf("warnings: %s\n", strlen(warnings)? warnings : "none");
        free(errors);
        free(warnings);
        PP_Run_Function("inventory", "print_files", "", NULL, "( )");
    }
    return 0;
}

main(int argc, char **argv)        /* C is on top, Python is embedded */
{                                  /* but Python can use C extensions too */
    run_user_validation( );        /* don't need sys.argv in embedded code */
}
There are a couple of things worth noticing here. First of all, in practice this program might fetch the Python code file's name or path from configurable shell variables; here, it is loaded from the current directory. Secondly, you could also code this program by using straight API calls instead of ppembed, but each of the "PP" calls here would then grow into a chunk of more complex code. As coded, you can compile and link this file with Python and ppembed library files to build a program. The Python code run by the resulting C program lives in Example 20-19; it uses preset globals and is assumed to set globals to send result strings back to C.
# embedded validation code, run from C
# input vars:  PRODUCT, QUANTITY, BUYER
# output vars: ERRORS, WARNINGS

import string              # all python tools are available to embedded code
import inventory           # plus C extensions, Python modules, classes,..
msgs, errs = [], []        # warning, error message lists

def validate_order( ):
    if PRODUCT not in inventory.skus( ):      # this function could be imported
        errs.append('bad-product')            # from a user-defined module too
    elif QUANTITY > inventory.stock(PRODUCT):
        errs.append('check-quantity')
    else:
        inventory.reduce(PRODUCT, QUANTITY)
        if inventory.stock(PRODUCT) / QUANTITY < 2:
            msgs.append('reorder-soon:' + `PRODUCT`)

first, last = BUYER[0], BUYER[1:]             # code is changeable on-site:
if first not in string.uppercase:             # this file is run as one long
    errs.append('buyer-name:' + first)        # code-string, with input and
if BUYER not in inventory.buyers( ):          # output vars used by the C app
    msgs.append('new-buyer-added')
    inventory.add_buyer(BUYER)
validate_order( )

ERRORS   = string.join(errs)      # add a space between messages
WARNINGS = string.join(msgs)      # pass out as strings: "" == none
Don't sweat the details in this code; some components it uses are not listed here either (see http://examples.oreilly.com/python2 for the full implementation). The thing you should notice, though, is that this code file can contain any kind of Python code -- it can define functions and classes, use sockets and threads, and so on. When you embed Python, you get a full-featured extension language for free. Perhaps even more importantly, because this file is Python code, it can be changed arbitrarily without having to recompile the C program. Such flexibility is especially useful after a system has been shipped and installed.
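The input-globals, run, output-globals protocol used here can be tried out entirely in Python before any C is written. The following sketch is a simplified stand-in, not the validation code from this example: the SCRIPT text, its rules, and the validate function are hypothetical, and a plain dictionary takes the place of the dummy module's namespace.

```python
# Hypothetical embedded code string; a real system would load it from a file
SCRIPT = """
errs, msgs = [], []
if QUANTITY <= 0:
    errs.append('check-quantity')
if not BUYER[:1].isupper():
    errs.append('buyer-name:' + BUYER[:1])
if QUANTITY > 100:
    msgs.append('large-order')
ERRORS = ' '.join(errs)
WARNINGS = ' '.join(msgs)
"""

def validate(product, quantity, buyer):
    # Set inputs as globals, run the string, then fetch outputs by name
    namespace = {"PRODUCT": product, "QUANTITY": quantity, "BUYER": buyer}
    exec(SCRIPT, namespace)
    return namespace["ERRORS"], namespace["WARNINGS"]

print(validate(111, 2, "Guido"))      # ('', '')
print(validate(222, 500, "larry"))    # ('buyer-name:l', 'large-order')
```

Changing the rules means editing only the code string; the host function that sets inputs and fetches outputs stays the same, which is the property that makes this scheme attractive for onsite customization.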
As discussed earlier, there is a variety of ways to structure embedded Python code. For instance, you can implement similar flexibility by delegating actions to Python functions fetched from module files, as illustrated in Example 20-20.
/* run embedded module-function validations */

#include <ppembed.h>
#include <stdio.h>
#include <string.h>
#include "ordersfile.h"

run_user_validation( )
{
    int i, status;                  /* should check status everywhere */
    char *errors, *warnings;        /* no file/string or namespace here */
    PyObject *results;

    for (i=0; i < numorders; i++) {
        printf("\n%d (%d, %d, '%s')\n",
            i, orders[i].product, orders[i].quantity, orders[i].buyer);

        status = PP_Run_Function(                /* validate2.validate(p,q,b) */
                         "validate2", "validate",
                         "O",         &results,
                         "(iis)",     orders[i].product,
                                      orders[i].quantity, orders[i].buyer);
        if (status == -1) {
            printf("Python error during validation.\n");
            PyErr_Print( );  /* show traceback */
            continue;
        }

        PyArg_Parse(results, "(ss)", &warnings, &errors);
        printf("errors:   %s\n", strlen(errors)? errors : "none");
        printf("warnings: %s\n", strlen(warnings)? warnings : "none");
        Py_DECREF(results);  /* ok to free strings */
        PP_Run_Function("inventory", "print_files", "", NULL, "( )");
    }
}

main(int argc, char **argv) {
    run_user_validation( );
}
The difference here is that the Python code file (shown in Example 20-21) is imported, and so must live on the Python module search path. It also is assumed to contain functions, not a simple list of statements. Strings can live anywhere -- files, databases, web pages, and so on, and may be simpler for end users to code. But assuming that the extra requirements of module functions are not prohibitive, functions provide a natural communication model in the form of arguments and return values.
# embedded validation code, run from C
# input = args, output = return value tuple

import string
import inventory

def validate(product, quantity, buyer):        # function called by name
    msgs, errs = [], []                        # via mod/func name strings
    first, last = buyer[0], buyer[1:]
    if first not in string.uppercase:
        errs.append('buyer-name:' + first)
    if buyer not in inventory.buyers( ):
        msgs.append('new-buyer-added')
        inventory.add_buyer(buyer)
    validate_order(product, quantity, errs, msgs)     # mutable list args
    return string.join(msgs), string.join(errs)       # use "(ss)" format

def validate_order(product, quantity, errs, msgs):
    if product not in inventory.skus( ):
        errs.append('bad-product')
    elif quantity > inventory.stock(product):
        errs.append('check-quantity')
    else:
        inventory.reduce(product, quantity)
        if inventory.stock(product) / quantity < 2:
            msgs.append('reorder-soon:' + `product`)
The ppembed API originally appeared as an example in the first edition of this book. Since then, it has been utilized in real systems and become too large to present here in its entirety. For instance, ppembed also supports debugging embedded code (by routing it to the pdb debugger module), dynamically reloading modules containing embedded code, and other features too complex to illustrate usefully here.
But if you are interested in studying another example of Python embedding calls in action, ppembed's full source code and makefile live in this directory on the enclosed CD (see http://examples.oreilly.com/python2):
As a sample of the kinds of tools you can build to simplify embedding, the ppembed API's header file is shown in Example 20-22. You are invited to study, use, copy, and improve its code as you like. Or simply write an API of your own; the main point to take from this section is that embedding programs need only be complicated if you stick with the Python runtime API as shipped. By adding convenience functions such as those in ppembed, embedding can be as simple as you make it. It also makes your C programs immune to changes in the Python C core; ideally, only the API must change if Python ever does.
Be sure to also see file abstract.h in the Python include directory if you are in the market for higher-level interfaces. That file provides generic type operation calls that make it easy to do things like creating, filling, indexing, slicing, and concatenating Python objects referenced by pointer from C. Also see the corresponding implementation file, abstract.c, as well as the Python built-in module and type implementations in the Python source distribution for more examples of lower-level object access. Once you have a Python object pointer in C, you can do all sorts of type-specific things to Python inputs and outputs.
/*************************************************************************
 * PPEMBED, VERSION 2.0
 * AN ENHANCED PYTHON EMBEDDED-CALL INTERFACE
 *
 * Wraps Python's run-time embedding API functions for easy use.
 * Most utilities assume the call is qualified by an enclosing module
 * (namespace). The module can be a file-name reference or a dummy module
 * created to provide a namespace for file-less strings. These routines
 * automate debugging, module (re)loading, input/output conversions, etc.
 *
 * Python is automatically initialized when the first API call occurs.
 * Input/output conversions use the standard Python conversion format
 * codes (described in the C API manual). Errors are flagged as either
 * a -1 int, or a NULL pointer result. Exported names use a PP_ prefix
 * to minimize clashes; names in the built-in Python API use Py prefixes
 * instead (alas, there is no "import" equivalent in C, just "from*").
 * Also note that the varargs code here may not be portable to certain
 * C compilers; to do it portably, see the text or file 'vararg.txt'
 * here, or search for string STDARG in Python's source code files.
 *
 * New in this version/edition: names now have a PP_ prefix, files
 * renamed, compiles to a single .a file, fixed pdb retval bug for
 * strings, and char* results returned by the "s" convert code now
 * point to new char arrays which the caller should free( ) when no
 * longer needed (this was a potential bug in prior version). Also
 * added new API interfaces for fetching exception info after errors,
 * precompiling code strings to byte code, and calling simple objects.
 *
 * Also fully supports Python 1.5 module package imports: module names
 * in this API can take the form "package.package.[...].module", where
 * Python maps the package names to a nested directories path in your
 * file system hierarchy; package dirs all contain __init__.py files,
 * and the leftmost one is in a directory found on PYTHONPATH. This
 * API's dynamic reload feature also works for modules in packages;
 * Python stores the full path name in the sys.modules dictionary.
 *
 * Caveats: there is no support for advanced things like threading or
 * restricted execution mode here, but such things may be added with
 * extra Python API calls external to this API (see the Python/C API
 * manual for C-level threading calls; see modules rexec and bastion
 * in the library manual for restricted mode details). For threading,
 * you may also be able to get by with C threads and distinct Python
 * namespaces per Python code segments, or Python language threads
 * started by Python code run from C (see the Python thread module).
 *
 * Note that Python can only reload Python modules, not C extensions,
 * but it's okay to leave the dynamic reload flag on even if you might
 * access dynamically-loaded C extension modules--in 1.5.2, Python
 * simply resets C extension modules to their initial attribute state
 * when reloaded, but doesn't actually reload the C extension file.
 *************************************************************************/

#ifndef PPEMBED_H
#define PPEMBED_H

#ifdef __cplusplus
extern "C" {             /* a C library, but callable from C++ */
#endif

#include <stdio.h>
#include <Python.h>

extern int PP_RELOAD;    /* 1=reload py modules when attributes referenced */
extern int PP_DEBUG;     /* 1=start debugger when string/function/member run */

typedef enum {
     PP_EXPRESSION,      /* which kind of code-string */
     PP_STATEMENT        /* expressions and statements differ */
} PPStringModes;

/***************************************************/
/*  ppembed-modules.c: load,access module objects  */
/***************************************************/

extern char     *PP_Init(char *modname);
extern int       PP_Make_Dummy_Module(char *modname);
extern PyObject *PP_Load_Module(char *modname);
extern PyObject *PP_Load_Attribute(char *modname, char *attrname);
extern int       PP_Run_Command_Line(char *prompt);

/**********************************************************/
/*  ppembed-globals.c: read,write module-level variables  */
/**********************************************************/

extern int
    PP_Convert_Result(PyObject *presult, char *resFormat, void *resTarget);

extern int
    PP_Get_Global(char *modname, char *varname, char *resfmt, void *cresult);

extern int
    PP_Set_Global(char *modname, char *varname, char *valfmt, ... /*val*/);

/***************************************************/
/*  ppembed-strings.c: run strings of Python code  */
/***************************************************/

extern int                                  /* run C string of code */
    PP_Run_Codestr(PPStringModes mode,      /* code=expr or stmt?  */
                   char *code,   char *modname,     /* codestr, modnamespace */
                   char *resfmt, void *cresult);    /* result type, target */

extern PyObject*
    PP_Debug_Codestr(PPStringModes mode,    /* run string in pdb */
                     char *codestring, PyObject *moddict);

extern PyObject *
    PP_Compile_Codestr(PPStringModes mode,
                       char *codestr);      /* precompile to bytecode */

extern int
    PP_Run_Bytecode(PyObject *codeobj,      /* run a bytecode object */
                    char     *modname,
                    char     *resfmt, void *restarget);

extern PyObject *                           /* run bytecode under pdb */
    PP_Debug_Bytecode(PyObject *codeobject, PyObject *moddict);

/*******************************************************/
/*  ppembed-callables.c: call functions, classes, etc. */
/*******************************************************/

extern int                                  /* mod.func(args) */
    PP_Run_Function(char *modname, char *funcname,      /* func|classname */
                    char *resfmt,  void *cresult,       /* result target  */
                    char *argfmt,  ... /* arg, arg... */ );  /* input arguments*/

extern PyObject*
    PP_Debug_Function(PyObject *func, PyObject *args);  /* call func in pdb */

extern int
    PP_Run_Known_Callable(PyObject *object,             /* func|class|method */
                          char *resfmt, void *restarget,    /* skip module fetch */
                          char *argfmt, ... /* arg,.. */ );

/**************************************************************/
/*  ppembed-attributes.c: run object methods, access members  */
/**************************************************************/

extern int
    PP_Run_Method(PyObject *pobject, char *method,      /* uses Debug_Function */
                  char *resfmt,  void *cresult,         /* output */
                  char *argfmt,  ... /* arg, arg... */ );   /* inputs */

extern int
    PP_Get_Member(PyObject *pobject, char *attrname,
                  char *resfmt,  void *cresult);        /* output */

extern int
    PP_Set_Member(PyObject *pobject, char *attrname,
                  char *valfmt,  ... /* val, val... */ );   /* input */

/**********************************************************/
/*  ppembed-errors.c: get exception data after api error  */
/**********************************************************/

extern void PP_Fetch_Error_Text( );    /* fetch (and clear) exception */

extern char PP_last_error_type[];      /* exception name text */
extern char PP_last_error_info[];      /* exception data text */
extern char PP_last_error_trace[];     /* exception traceback text */

extern PyObject *PP_last_traceback;    /* saved exception traceback object */

#ifdef __cplusplus
}
#endif

#endif /* !PPEMBED_H */
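The header's precompile and run-bytecode calls reflect an idea that is easy to try at the Python level with the built-in compile function: a code string that will be run many times is compiled once, and the resulting code object is then run against fresh namespaces. The sketch below shows only that general technique under the usual statement/expression split; it makes no claim about how ppembed implements these calls, and the names in it are illustrative.

```python
# Compile once: "exec" mode for statements, "eval" mode for expressions
stmt = compile("RESULT = NAME.upper() + '!'", "<embedded>", "exec")
expr = compile("len(NAME) * 2", "<embedded>", "eval")

results = []
for name in ("spam", "eggs"):
    namespace = {"NAME": name}           # fresh inputs for each run
    exec(stmt, namespace)                # statement: output left in namespace
    value = eval(expr, namespace)        # expression: output is the value
    results.append((namespace["RESULT"], value))

print(results)                           # [('SPAM!', 8), ('EGGS!', 8)]
```

Precompiling pays off when the same string is run in a loop, such as once per order or once per event, since the parsing cost is paid only the first time.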
While writing this chapter, I ran out of space before I ran out of examples. Besides the ppembed API example described in the last section, you can find a handful of additional Python/C integration self-study examples on this book's CD (see http://examples.oreilly.com/python2):
The full implementation of the validation examples listed earlier. This case study uses the ppembed API to run embedded Python order validations, both as embedded code strings and as functions fetched from modules. The inventory is implemented with and without shelves and pickle files for data persistence.
A tool for exporting C variables for use in embedded Python programs.
A simple ppembed test program, shown with and without package import paths to identify modules.
Some of these are large C examples that are probably better studied than listed.
In this book, the term integration has largely meant mixing Python with components written in C or C++ (or other C-compatible languages) in extending and embedding modes. But from a broader perspective, integration also includes any other technology that lets us mix Python components into larger systems. This last section briefly looks at a handful of integration technologies beyond the C API tools we've seen in this part of the book.
We met JPython in Chapter 15, but it is worth another mention in the context of integration at large. As we saw earlier, JPython supports two kinds of integration:
JPython uses Java's reflection API to allow Python programs to call out to Java class libraries automatically (extending). The Java reflection API provides Java type information at runtime, and serves the same purpose as the glue code we've generated to plug C libraries into Python in this part of the book. In JPython, however, this runtime type information allows largely automated resolution of Java calls in Python scripts -- no glue code has to be written or generated.
JPython also provides a Java PythonInterpreter class API that allows Java programs to run Python code in a namespace (embedding), much like the C API tools we've used to run Python code strings from C programs. In addition, because JPython implements all Python objects as instances of a Java PyObject class, it is straightforward for the Java layer that encloses embedded Python code to process Python objects.
In other words, JPython allows Python to be both extended and embedded in Java, much like the C integration strategies we've seen in this part of the book. With the addition of the JPython system, Python may be integrated with any C-compatible program by using C API tools, as well as any Java-compatible program by using JPython.
Although JPython provides a remarkably seamless integration model, Python code runs slower in the JPython implementation, and its reliance on Java class libraries and execution environments introduces Java dependencies that may be a concern in some development scenarios. See Chapter 15 for more JPython details; for the full story, read the documentation available online at http://www.jython.org (also available in the JPython package at http://examples.oreilly.com/python2).
We briefly discussed Python's support for the COM object model on Windows when we explored Active Scripting in Chapter 15, but it's really a general integration tool that is useful apart from the Internet too.
Recall that COM defines a standard and language-neutral object model with which components written in a variety of programming languages may integrate and communicate. Python's win32all Windows extension package tools allow Python programs to implement both server and client in the COM interface model.
As such, it provides a powerful way to integrate Python programs with programs written in other COM-aware languages such as Visual Basic, Delphi, Visual C++, PowerBuilder, and even other Python programs. Python scripts can also use COM calls to script popular Microsoft applications such as Word and Excel, since these systems register COM object interfaces of their own. Moreover, the newcomer Python implementation (tentatively called Python.NET) for Microsoft's C#/.NET technology mentioned in Chapter 15 provides another way to mix Python with other Windows components.
On the downside, COM implies a level of dispatch indirection and is a Windows-only solution at this writing. Because of that, it is not as fast or as portable as some of the lower-level integration schemes we've studied in this part of the book (linked-in, in-process, and direct calls between Python and C-compatible language components). For nontrivial use, COM is also considered to be a large system, and further details about it are well beyond the scope of this book.
For more information on COM support and other Windows extensions, refer to Chapter 15 in this book, and to O'Reilly's Python Programming on Win32. That book also describes how to use Windows compilers to do Python/C integration in much more detail than is possible here; for instance, it shows how to use Visual C++ tools to compile and link Python C/C++ integration layer code. The basic C code behind low-level extending and embedding on Windows is the same as shown in this book, but compiling and linking details vary.
There is also much support, some of it open source, for using Python in the context of a CORBA-based application. CORBA stands for the Common Object Request Broker Architecture; it's a language-neutral way to distribute systems among communicating components, which speak through an object model architecture. As such, it represents another way to integrate Python components into a larger system.
Python's CORBA support includes the public domain systems ILU (from Xerox) and fnorb (see http://www.python.org). At this writing, the OMG (Object Management Group, responsible for directing CORBA growth) is also playing host to an effort to elect Python as the standard scripting language for CORBA-based systems. Whether that ultimately transpires or not, Python is an ideal language for programming distributed objects, and is being used in such a role by many companies around the world.
Like COM, CORBA is a large system -- too large for us to even scratch the surface in this text. For more details, search Python's web site for CORBA-related materials.
Given so many integration options, choosing between them can be puzzling. When should you choose something like COM over writing C extension modules, for instance? As usual, it depends on why you're interested in mixing external components into your Python programs in the first place.
Basically, frameworks such as JPython, COM, and CORBA allow Python scripts to leverage existing libraries of software components, and do a great job of addressing goals like code reuse and integration. However, they say almost nothing about optimization: integrated components are not necessarily faster than the Python equivalents.
On the other hand, Python extension modules and types coded in a compiled language like C serve two roles: they too can be used to integrate existing components, but also tend to be a better approach when it comes to boosting system performance.
Let's consider the big picture here. Frameworks such as COM and CORBA can perhaps be understood as alternatives to the Python/C integration techniques we met in this part of the book. For example, packaging Python logic as a COM server makes it available for something akin to embedding -- many languages (including C) can access it using the COM client-side interfaces we met in Chapter 15. And as we saw earlier, JPython allows Java to embed and run Python code and objects through a Java class interface.
Furthermore, frameworks allow Python scripts to use existing component libraries: standard Java class libraries in JPython, COM server libraries on Windows, and so on. In such a role, the external libraries exposed by such frameworks are more or less analogous to Python extension modules. For instance, Python scripts that use COM client interfaces to access an external object are acting much like importers of C extension modules (albeit through the COM indirection layer).
Python's C API is designed to serve in many of the same roles. As we've seen, C extension modules can serve as code reuse and integration tools too -- it's straightforward to plug existing C and C++ libraries into Python with SWIG. In most cases, we simply generate and import the glue code created with SWIG to make almost any existing compiled library available for use in Python scripts[9]. Moreover, Python's embedding API allows other languages to run Python code, much like client-side interfaces in COM.
One of the primary reasons for writing C extension modules in the first place, though, is optimization: key parts of Python applications may be implemented or recoded as C or C++ extension modules to speed up the system at large (as in the last chapter's stack examples). Moving such components to compiled extension modules not only improves system performance, but is completely seamless -- module interfaces in Python look the same no matter what programming language implements the module.
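The "seamless" point can be sketched in a few lines. The module name cstack_accel below is hypothetical (an assumption made only for this illustration, not a module from this book); if no such compiled module is installed, the script falls back to a pure-Python class with the same interface, and the client code at the bottom neither knows nor cares which one it got.

```python
try:
    # Hypothetical compiled extension; the name is an assumption for this sketch.
    from cstack_accel import Stack
    IMPLEMENTATION = "C extension"
except ImportError:
    class Stack:
        """Pure-Python fallback exposing the same interface."""
        def __init__(self):
            self._items = []
        def push(self, item):
            self._items.append(item)
        def pop(self):
            return self._items.pop()
        def __len__(self):
            return len(self._items)
    IMPLEMENTATION = "pure Python"

def drain(stack):
    """Client code: identical no matter which Stack was imported."""
    items = []
    while len(stack):
        items.append(stack.pop())
    return items

stack = Stack()
for word in ("spam", "eggs", "ham"):
    stack.push(word)
print(IMPLEMENTATION, drain(stack))
```

Because clients depend only on the interface, a component can start life in Python and be recoded in C later if profiling shows that it matters.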
By contrast, JPython, COM, and CORBA do not deal directly with optimization goals at all; they serve only to integrate. For instance, JPython allows Python scripts to automatically access Java libraries, but generally mandates that non-Python extensions be coded in Java, a language that is itself usually interpreted and no speed demon. COM and CORBA focus on the interfaces between components and leave the component implementation language ambiguous by design. Exporting a Python class as a COM server, for instance, can make its tools widely reusable on Windows, but has little to do with performance improvement.
Because of their different focus, frameworks are not quite replacements for the more direct Python/C extension modules and types we've studied in these last two chapters, and are less direct (and hence likely slower) than Python's C embedding API. It's possible to mix-and-match approaches, but the combinations are rarely any better than their parts. For example, although C libraries can be added to Java with its native call interface, it's neither a secure nor straightforward undertaking. And while C libraries can also be wrapped as COM servers to make them visible to Python scripts on Windows, the end result will probably be slower and no less complex than a more directly linked-in Python extension module.
As you can see, there are a lot of options in the integration domain. Perhaps the best parting advice I can give you is simply that different tools are meant for different tasks. C extension modules and types are ideal for optimizing systems and integrating libraries, but frameworks offer other ways to integrate components -- JPython for mixing in Java tools, COM for reusing and publishing objects on Windows, and so on. As ever, your mileage may vary.
[1] If you want an example, flip back to the discussion of Active Scripting in Chapter 15. This system fetches Python code embedded in an HTML web page file, assigns global variables in a namespace to objects that give access to the web browser's environment, and runs the Python code in the namespace where the objects were assigned. I recently worked on a project where we did something similar, but Python code was embedded in XML documents, and objects preassigned to globals in the code's namespace represented widgets in a GUI.
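The general pattern described in this note (extract embedded code, preset globals, run the code in that namespace) can be sketched in pure Python. Everything specific here is an assumption made for illustration: the XML layout, the script tag name, and the Widget class are hypothetical stand-ins, not the format or objects used by Active Scripting or by the project mentioned above.

```python
import textwrap
import xml.etree.ElementTree as ET

# Hypothetical document: a GUI description with Python code in a <script> tag.
DOCUMENT = """
<form>
  <button name="ok">
    <script>
      widget.label = widget.label.upper()
      clicked = True
    </script>
  </button>
</form>
"""

class Widget:
    """Stand-in for a GUI widget object exposed to the embedded code."""
    def __init__(self, label):
        self.label = label

def run_embedded_scripts(xml_text):
    """Run each button's script in a namespace with 'widget' preassigned."""
    root = ET.fromstring(xml_text.strip())
    results = {}
    for button in root.iter("button"):
        script = button.find("script")
        if script is None or not script.text:
            continue
        code = textwrap.dedent(script.text)      # drop the XML indentation
        namespace = {"widget": Widget(button.get("name"))}
        exec(code, namespace)                    # run code with preset globals
        results[button.get("name")] = namespace
    return results

results = run_embedded_scripts(DOCUMENT)
print(results["ok"]["widget"].label)    # OK
print(results["ok"]["clicked"])         # True
```

An enclosing C program would do the same three steps with API calls: build a dictionary, store objects in it, and run the extracted string with that dictionary as its namespace.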
[2] My build environment is a little custom (really, odd), so I first need to source $PP2E/Config/setup-pp-embed.csh to set up PYTHONPATH to point to the source library directory of a custom Python build on my machine. In Python 1.5.2, at least, Python may have trouble locating standard library directories when it is embedded, especially if there are multiple Python installs on the same machine (e.g., the interpreter and library versions may not match). This probably won't be an issue in your build environment, but see the sourced file's contents for more details if you get startup errors when you try to run a C program that embeds Python. You may need to customize your login scripts or source such a setup configuration file before running the embedding examples, but only if your Python lives in dark places.
[3] A related function lets you run files of code but is not demonstrated in this chapter: PyObject* PyRun_File(FILE *fp, char *filename, mode, globals, locals). Because you can always load a file's text and run it as a single code string with PyRun_String, the PyRun_File call is not always necessary. In such multiline code strings, the \n character terminates lines and indentation groups blocks as usual.
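As a rough Python-level analogue of this idea (a sketch using the built-in exec, not the C API calls themselves), the following runs a multiline code string in a namespace dictionary, and then loads the same text from a file and runs it the same way. The helper name run_code_file and the file name embedded.py are hypothetical.

```python
import os
import tempfile

# A multiline code string: \n ends each line, indentation groups blocks.
code = (
    "total = 0\n"
    "for n in numbers:\n"
    "    if n % 2 == 0:\n"
    "        total += n\n"
)

namespace = {"numbers": [1, 2, 3, 4]}    # inputs preset as globals
exec(code, namespace)
print(namespace["total"])                # 6

def run_code_file(path, namespace):
    """Load a file's text and run it as one code string."""
    with open(path) as source_file:
        exec(source_file.read(), namespace)
    return namespace

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "embedded.py")
    with open(path, "w") as out:
        out.write(code)
    file_namespace = run_code_file(path, {"numbers": [10, 11, 12]})

print(file_namespace["total"])           # 22
```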
[4] This link also plays a part in Python's restricted-execution mode, described in Chapter 15. By changing the built-in scope link to a module with limited attribute sets and customized versions of built-in calls like open, the rexec module can control machine access from code run through its interface.
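The scope link itself is easy to see from Python: code run in a dictionary whose __builtins__ key names a limited set of tools can reach only those tools as built-ins. The sketch below illustrates the lookup mechanism only; the names guarded_open and run_limited are hypothetical, this is not how rexec is coded, and a substituted built-in scope should not be mistaken for a real security boundary.

```python
def guarded_open(*args, **kwargs):
    """Customized stand-in for the built-in open."""
    raise PermissionError("file access is disabled for embedded code")

# The only built-in names the embedded code will be able to see.
limited_builtins = {"len": len, "open": guarded_open}

def run_limited(code):
    """Run a code string whose built-in scope is the limited dictionary."""
    namespace = {"__builtins__": limited_builtins}
    exec(code, namespace)
    return namespace

print(run_limited("size = len('spam')")["size"])    # 4

# open is replaced by the guarded version; print is simply absent.
for code in ("open('secret.txt')", "print('hello')"):
    try:
        run_limited(code)
    except (PermissionError, NameError) as exc:
        print(type(exc).__name__, "-", exc)
```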
[5] Just in case you flipped ahead to this chapter early: bytecode is simply an intermediate representation for already compiled program code in the current standard Python implementation. It's a low-level binary format that can be quickly interpreted by the Python runtime system. Bytecode is usually generated automatically when you import a module, but there may be no notion of an import when running raw strings from C.
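The same precompile-then-run idea is available inside Python through the built-in compile function, which makes for a quick experiment. This is a Python-level sketch rather than the C API calls used in this chapter; the "<embedded>" filename is just a label that would appear in tracebacks.

```python
# Compile once: one expression and one statement string.
expression = compile("price * quantity", "<embedded>", "eval")
statements = compile("count = count + 1\n", "<embedded>", "exec")

print(type(expression).__name__)     # code

# Run the compiled expression many times without reparsing its text.
for quantity in (1, 2, 3):
    print(eval(expression, {"price": 5, "quantity": quantity}))   # 5, 10, 15

# Run the compiled statement repeatedly in one persistent namespace.
namespace = {"count": 0}
for _ in range(3):
    exec(statements, namespace)
print(namespace["count"])            # 3
```

When the same string is run more than once, compiling it ahead of time avoids repeating the parse and compile steps on every run.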
[6] And if C chooses to do so, it might even run embedded Python code that uses Python's standard HTML and XML processing tools to parse out the embedded code associated with an event tag. See the Python library manual for details on these parsers.
[7] If you're looking for a more realistic example of Python callback handlers, see the Tkinter GUI system used extensively in this book. Tkinter uses both extending and embedding. Its extending interface (widget objects) is used to register Python callback handlers, which are later run with embedding interfaces in response to GUI events. You can study Tkinter's implementation in the Python source distribution for more details, though its Tk library interface logic makes it a somewhat challenging read.
[8] This is more or less the kind of structure used when Python is embedded in HTML files in the Active Scripting extension, except that the globals set here (e.g., PRODUCT) become names preset to web browser objects, and the code is extracted from a web page, not fetched from a text file with a known name. See Chapter 15.
[9] In fact, it's so easy to plug in libraries with SWIG that extensions are usually best coded first as simple C/C++ libraries, and later wrapped for use in Python with SWIG. Adding a COM layer to an existing C library may or may not be as straightforward, but will clearly be less portable -- COM is currently a Windows-only technology.